38 research outputs found

Automatic construction of neural networks for special purpose speech recognition systems

Author: Bodenhausen Ulrich
Hild Hermann
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

Speaker-independent connected letter recognitions with a multi-state time delay neural network

Author: Hild Hermann
Waibel Alex
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

A 3-D error diffusion dither algorithm for half-tone animation on bitmap screens

Author: Hild Hermann
Pins Markus
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

Variations on a dither algorithm

Author: Hild Hermann
Pins Markus
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

Recognition of spelled names over the telephone

Author: Hild Hermann
Waibel Alex
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

Language models for a spelled letter recognizer

Author: Betz Martin
Hild Hermann
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

Multi-speaker / speaker-independent architectures for the multi-state time delay neural network

Author: Hild Hermann
Waibel Alex
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

A comparison of ID3 and backpropagation for English text-to-speech mapping

Author: Bakiri G.
Dietterich T.G.
Hild Hermann
Publication venue
Publication date: 02/08/2007
Field of study

KITopen

Variations on ID3 for text-to-speech conversion

Author: Hild Hermann
Publication venue: Oregon State University. Department of Computer Science
Publication date
Field of study

ScholarsArchive@OSU

Variations on ID3 for text-to-speech conversion

Author: Hild Hermann
Publication venue: Oregon State University
Publication date
Field of study

Summary of results. After implementing the plain ID3 algorithm, I experimented with various modifications. Two improvements to the process of finding a legal phoneme/stress could be made by using statistical information about the letter-to-phoneme/stress mapping in the training set. Adding the chi-square test to the ID3 algorithm reduced the size of the trees but did not improve performance. However, on the random k-DNF concepts the chi-square test was very effective. Learning the stresses from the phonemes instead of from the English text showed the interesting characteristic that the overall performance improved, although the stresses got worse. When a separate set of trees was learned for each letter, some letters improved and others didn't, while the stress performance suffered generally. Keeping separate trees only for the "winner letters" and learning the stresses from the phonemes allowed further gains in performance.

In the "multi-level-ID3" experiment the goal was to learn on a higher level in some cases. Common letter combinations such as er or ion were extracted from the data and treated as an entity during the learning phase. Trying to learn on this level was not as successful as expected, probably because the number of training examples became too small. In fact, it turned out that using the common letter blocks just to constrain the outcome found with the standard letter-by-letter classification was even more successful. A combined ID3/NN approach kept subsets of the training examples as exception lists in some of the leaves of the tree. This method was not as successful as the regular way of generalizing such leaves with the majority rule. The postprocessing of rules extracted from the decision trees, as suggested by [Quinlan87], enhanced performance for the boolean k-DNF concepts but was not successful on the NETtalk data.

The selection of examples for the training set seems not to have a big influence on the performance. Trees trained on ten randomly selected 1000-word training sets show only minor deviations in their performance. However, all ten randomly selected training sets result in much bigger trees than the training set of the 1000 most common words.

Here is a summary of the achieved improvements:

ID3-TREES               WORDS  LETTERS  (PHON  STRESS)  BITS  LEAVES  DEPTH
---------------------------------------------------------------------------
(0) Neural Net    TEST:  12.1     67.1   77.6    79.6   96.1       -      -
(1) ID3, plain    TEST:   9.1     60.8   75.4    74.4   95.8   165.3   23.1
(2) ID3, b.g.(3)  TEST:  12.9     67.2   79.9    78.0   96.2   165.3   23.1
(3) ID3           TEST:  14.3     68.2   80.5    76.5   96.2       -      -
(4) ID3           TEST:  15.8     69.3   80.3    78.6   96.2   165.3   23.1

(0) Neural Net, back-propagation [Dietterich89].
(1) Plain ID3, with the straightforward "best guess" strategy.
(2) ID3, with an improved "best guess" strategy.
(3) Combined method of learning some letters individually and classifying the stresses from the phonemes.
(4) Learning on a higher level with common letter blocks.

The neural net (0) outperforms the comparable ID3 result (1). None of the ID3 versions can learn the stresses as well as the back-propagation algorithm. It would be interesting to see how much performance one could gain if the methods used to improve ID3 were applied to the neural net. One reason for the better performance of the neural nets might be the fact that they learn all bits simultaneously.
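The chi-square test mentioned in the summary can be illustrated with a small sketch. The abstract does not say how the thesis applied the test, so the code below shows only one common arrangement: ID3 ranks candidate attributes by information gain, and a split is accepted only if the Pearson chi-square statistic of its attribute-versus-class table exceeds a chosen critical value. The function names, the data layout (a list of `(features, label)` pairs), and the toy examples are assumptions made for illustration, not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def partition(examples, attribute):
    """Group the labels of (features, label) examples by one attribute's value."""
    groups = {}
    for features, label in examples:
        groups.setdefault(features[attribute], []).append(label)
    return groups


def information_gain(examples, attribute):
    """Entropy reduction obtained by splitting on the attribute (ID3 criterion)."""
    labels = [label for _, label in examples]
    groups = partition(examples, attribute)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def chi_square(examples, attribute):
    """Pearson chi-square statistic of the attribute-vs-class contingency table."""
    labels = [label for _, label in examples]
    class_counts = Counter(labels)
    total = len(labels)
    stat = 0.0
    for group in partition(examples, attribute).values():
        observed = Counter(group)
        for cls, n_cls in class_counts.items():
            # Expected count if the attribute were independent of the class.
            expected = len(group) * n_cls / total
            stat += (observed[cls] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: attribute "a" determines the class, "b" is irrelevant.
examples = [
    ({"a": 0, "b": 0}, "x"),
    ({"a": 0, "b": 1}, "x"),
    ({"a": 1, "b": 0}, "y"),
    ({"a": 1, "b": 1}, "y"),
]
```

On this toy data, splitting on `a` gives a large statistic and splitting on `b` gives zero, so a threshold test would reject the split on `b` and stop growing the tree there. Rejecting such splits is how a test of this kind can shrink trees, which matches the size reduction reported in the summary; the choice of critical value is left open here.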

ScholarsArchive@OSU